tion of milk and the semen which includes the desired α-lactalbumin protein DNA sequence.
DNA SEQUENCE ENCODING BOVINE α-LACTALBUMIN
AND METHODS OF USE

FIELD OF THE INVENTION 

The present invention relates generally to a
DNA sequence encoding bovine α-lactalbumin and to methods
of producing proteins, including recombinant proteins, in
the milk of lactating genetically engineered or
transgenic mammals. The present invention relates also
to genetically engineered or transgenic mammals that
secrete the recombinant protein. The present invention
is also directed to a genetic marker for identifying
animals with superior milk producing characteristics.

REFERENCE TO CITED ART 

Reference is made to the section preceding 
the CLAIMS for a full bibliography citation of the art 
cited herein. 

DESCRIPTION OF THE PRIOR ART 
α-Lactalbumin is a major whey protein found
in cow's milk (Eigel et al., 1984). The term "whey
protein" includes a group of milk proteins that remain
soluble in "milk serum" or whey after the precipitation
of casein, another milk protein, at pH 4.6 and 20°C.
α-Lactalbumin has these characteristics.

α-Lactalbumin is a secretory protein that
normally comprises about 2.5% of the total protein in
milk. α-Lactalbumin has been used as an index of mammary
gland function in response to hormonal regulation in
bovine explant culture (Akers et al., 1981; Goodman et
al., 1983) and as an index of udder development (McFadden
et al., 1986). α-Lactalbumin interacts with galactosyl
transferase and therefore plays an essential role in the
biosynthesis of the milk sugar lactose (Brew, K. and R.L.
Hill, 1975). Lactose is an important component in milk,
and contributes to milk osmolality. It is the most
constant constituent in cow's milk (Larson, 1985).
α-Lactalbumin is useful as an index of lactogenesis in
cultured mammary tissue (McFadden et al., 1987). It is
therefore believed that α-lactalbumin is an important
protein in controlling milk yield and can be used as an
indicator of mammary function.

The expression of bovine α-lactalbumin may be
a potential rate-limiting process in dairy cattle. If
greater expression of the α-lactalbumin gene can be
obtained, then more milk and milk protein could be
produced. In other words, α-lactalbumin is a potential
Quantitative Trait Locus (QTL).

SUMMARY OF THE INVENTION 

One object of the present invention is to
detect possible genetic differences in the expression of
bovine α-lactalbumin.

Another object of the present invention is to
provide a DNA sequence encoding a mammary specific bovine
α-lactalbumin protein having a specified nucleotide
sequence.

It is also an object of the present invention
to provide a method for genetically engineering the
incorporation of one or more copies of a construct
comprising an α-lactalbumin control region, which
construct is specifically activated in the mammary
tissue.

These objects and others are addressed by the
present invention, which is directed to a DNA sequence
encoding bovine α-lactalbumin having a specified
nucleotide sequence.

The present invention is also directed to an
expression vector comprising this DNA sequence. Further,
the present invention is directed to the protein
α-lactalbumin encoded by the nucleotide sequence.

The present invention is also directed to an
expression system comprising a mammary specific
α-lactalbumin control region which, when genetically
incorporated into a mammal, permits the female of
that mammal to produce the desired recombinant protein in
its milk.

The present invention is also directed to a
genetically engineered or transgenic mammal comprising
the specified DNA sequence encoding bovine α-lactalbumin.

The present invention is also directed to a
DNA sequence coding for α-lactalbumin, which is
operatively linked to an expression system coding for a
mammary-specific α-lactalbumin protein control, or any
control region which specifically activates α-lactalbumin
in milk or in mammary tissue, through a DNA sequence
coding for a signal peptide that permits secretion and
maturation of the α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for genetically engineering the incorporation of
one or more copies of a construct comprising an
α-lactalbumin control region which specifically activates
α-lactalbumin in milk or in mammary tissue. The control
region is operatively linked to a DNA sequence coding for
a desired recombinant protein through a DNA sequence
coding for a signal peptide that permits the secretion
and maturation of α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for the production and secretion into a mammal's
milk of an exogenous recombinant protein. The steps
include producing milk in a genetically engineered or
transgenic mammal. The mammal is characterized by an
expression system comprising an α-lactalbumin control
region. The control region is operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the
recombinant protein in mammary tissue. The milk is then
collected for use. Alternatively, the exogenous
recombinant protein is isolated from the milk.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing animals comprising a DNA
sequence encoding bovine α-lactalbumin and having a
specified nucleotide sequence.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing mammals. The mammals are
characterized by inherited genetic material in the DNA
structure of the mammal. The genetic material encodes at
least one desired dominant selectable marker for bovine
α-lactalbumin. One such marker is adenosine, which is
located at the -13 position on the control region of the
DNA sequence for α-lactalbumin. The present invention is
also directed to a method of predicting superior milk and
milk protein production in animals comprising identifying
the selection characteristic discussed above.

The present invention is further directed to
a method for modifying the milk composition in mammals
which comprises inserting a DNA sequence encoding bovine
α-lactalbumin having a specified nucleotide sequence.

The DNA sequence and the various methods of
using it have potentially beneficial uses for dairy
farmers, artificial insemination organizations, genetic
marker companies, and embryo transfer and cloning
companies, to name a few.

The uses for this genetic marker include the
identification of superior nuclear transfer embryos and
the identification of superior embryos to clone.

The present invention also will aid in the
progeny testing of sires. The specified DNA sequence can
be used as a genetic marker to identify possible elite
sires in terms of milk production and milk protein
production. This will increase the reliability of buying
superior dairy cattle.

The present invention also will provide
assistance in farm management decisions, such as sire
selection and selective culling. The physiological
markers assist in determining future production
performance in addition to a cow's pedigree. From this
information, one could buy or retain a heifer with a DNA
sequence encoding α-lactalbumin of the present invention
and consider culling a heifer without the proper
sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic illustration of a
partial restriction map of the bovine α-lactalbumin of
the present invention. The sequence contains 2.0
kilobases of a 5' flanking region, 1.7 kilobases of a
coding region and 8.8 kilobases of a 3' flanking region.
Digestion with Hpa I yields a 2.8 kilobase fragment
containing the whole 5' flanking region.

Fig. 2 depicts in schematic outline a map of
the plasmid α-lac Pro/pIC 20R. A Hpa I fragment of the
genomic clone was inserted into the EcoRV site of pIC
20R. The Hpa I fragment contains 2.1 kb of 5' flanking
DNA, the signal peptide coding region of α-lactalbumin and
8 bases encoding the mature α-lactalbumin protein. Six
unique enzyme sites are available for attaching various
genes to the sequence.

Fig. 3 is a schematic illustration of a
detailed map of the α-lactalbumin 5' flanking control
region cloned in the EcoRV site of the plasmid pIC 20R
(SEQ ID NO: 1, SEQ ID NO: 2).

Fig. 4 is a schematic illustration of a
detailed map of the 8.0 kilobase Bgl II fragment.

Fig. 5 depicts the nucleotide sequence (SEQ
ID NO: 3) of the control/enhancer region of the bovine
α-lactalbumin protein.

Fig. 6 depicts in schematic outline a map of
a plasmid containing the bovine α-lactalbumin-bovine
β-casein gene construct.

Fig. 7 illustrates a sequence comparison
between human and bovine genes in the 5' flanking region
of the bovine α-lactalbumin protein between the present
invention U.S. bovine sequence (SEQ ID NO: 4), a human
sequence (SEQ ID NO: 5) and the French bovine (SEQ ID
NO: 6) for the putative steroid response element and
between the present invention U.S. bovine sequence (SEQ
ID NO: 7), a human sequence (SEQ ID NO: 8) and the French
bovine (SEQ ID NO: 9) for the RNA polymerase binding
region, surrounding three of the four nucleotide sequence
variant mutations.

Fig. 8 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the human α-lactalbumin sequence.

Fig. 9 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the guinea pig α-lactalbumin sequence.

Fig. 10 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the rat α-lactalbumin sequence.

Fig. 11 is a graph illustrating expression
levels observed in each of three α-lactalbumin transgenic
mouse lines.

Fig. 12 is a 4% NuSieve autoradiographic gel
of Mnl I digested PCR products.

Fig. 13 is a graph illustrating a scatter 
plot of each data point in Fig. 12 as well as mean values 
for each of the three genotypes. 

DETAILED DESCRIPTION OF THE PREFERRED INVENTION

In the Description the following terms are
employed:

Genetic engineering, manipulation or
modification: the formation of new combinations of
materials by the insertion of nucleic acid molecules
produced outside the cell into any virus, bacterial
plasmid or other vector system so as to allow their
incorporation into a host organism in which they do not
naturally occur, but in which they are capable of
continued propagation at least throughout the life of the
host organism. Although the term incorporates transgenic
alteration, the manipulation of the genomic sequence does
not have to be permanent, i.e., the genetic engineering
can affect only the animal which was directly
manipulated.

Transgenic animals: permanently genetically 
engineered animals created by introducing new DNA 
sequences into the germ line via addition to the egg. 

It is within the scope of the present
application to use any mammal for the invention.
Examples of mammals include cows, sheep, goats, mice,
oxen, camels, water buffaloes, llamas and pigs.
Preferred mammals include those that produce large
volumes of milk and have long lactating periods.

The present invention is directed to a gene
which encodes bovine α-lactalbumin. This gene has been
isolated and characterized. The 5' flanking region of
the gene has been cloned into six vectors for use as a
mammary specific control region in the production of
genetically engineered mammals. To better understand the
regulation of this control region, 2.0 kilobases of the
5' flanking sequence have been sequenced. The
α-lactalbumin 5' flanking sequence serves as a useful
mammary-specific "control/enhancer complex" for
engineering genetic constructs that could be capable of
driving the expression of novel and useful proteins in
the milk of genetically engineered or transgenic mammals.
This results in an increase in milk production and the
protein composition in milk, a change in the milk and/or
protein composition in milk, and the production of
valuable proteins in the milk of genetically engineered
or transgenic mammals. Such proteins include insulin,
growth hormone, growth hormone releasing factor,
somatostatin, tissue plasminogen activator, tumor
necrosis factor, lipocortin, coagulation factors VIII and
IX, the interferons, colony stimulating factor, the
interleukins, urokinase, industrial enzymes such as
cellulases, hemicellulases, peroxidases, and thermal
stable enzymes.



The α-lactalbumin gene is the preferred gene
for use in the process because its 5' control region is
mammary specific. It also exerts the tightest
lactational control of all milk proteins. Further, it is
independently regulated from other milk proteins and is
produced in large quantity by lactating animals.
Total Sequence

A gene encoding the milk protein bovine
α-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The Charon 28 lambda library was probed
using a bovine α-lactalbumin cDNA (Hurley, 1987) and a
770 base pair α-lactalbumin polymerase chain reaction
product. The positive lambda clone includes 12.5
kilobases of inserted bovine sequence, consisting of 2.0
kilobases of a 5' flanking (control/enhancer) region, a
1.7 kilobase coding region and 8.8 kilobases of a 3'
flanking region. A partial restriction map of the clone
is illustrated in Fig. 1.

A 2.8 kilobase Hpa I fragment including the
2.0 kilobase control region along with the signal peptide
coding region was cloned into the EcoRV site of the
plasmid pIC 20R. The plasmid is illustrated in schematic
outline in Fig. 2.

An 8.0 kilobase Bgl II fragment containing a
2.0 kilobase 5' flanking control region, a 1.7 kilobase
coding region, 3.0 kilobases of a 3' flanking region and
1.2 kilobases of lambda DNA has also been isolated.
Reference is made to Fig. 4 for a map of the 8.0
kilobase fragment. Transgenic mice have been produced
using the Bgl II fragment.
Control/Enhancer Region

The 2.0 kilobase 5' flanking region has been
cloned into the vectors pIC 20R and Bluescript KS+. A
schematic illustration of the α-lactalbumin 5' flanking
control region cloned in the EcoRV site of pIC 20R is
depicted in Figs. 2 and 3 (SEQ ID NO: 1, SEQ ID NO: 2).



The construct's multiple cloning site, which
exists downstream of the signal peptide coding region,
permits various genes to be attached to the α-lactalbumin
control region. Thus, this vector allows for easy
attachment of specific coding sequences of genes. It
contains all elements necessary for expression of
proteins in milk, i.e., a mammary specific control
region, a mammary specific signal peptide coding region
and a mature protein-signal peptide splice site which is
able to be cleaved in the mammary gland. The vector also
contains many unique restriction enzyme sites for ease of
cloning. Attachment of genes to this control region will
allow for mammary expression of the genes when these
constructs are placed into mammals. These vectors also
contain the α-lactalbumin signal peptide coding sequence
which will allow for proper transport of the expressed
protein into the milk of the lactating mammal.

The control region construct has driven
mammary expression of a desired protein in transgenic
mice. Bovine α-lactalbumin levels of greater than 1
mg/ml have been observed in the milk of transgenic mouse
lines as described in Example 2 (infra). Constructs
containing the 2.0 kilobase region attached to the bovine
β-casein gene (Bonsing, J., et al., 1988) as well as the
bacterial reporter gene chloramphenicol acetyl
transferase have been produced in our lab. Fig. 6 is a
schematic representation of a plasmid containing the
bovine α-lactalbumin-bovine β-casein gene construct. The
genomic DNA sequence containing the bovine β-casein gene
was attached to the bovine α-lactalbumin 5' flanking
sequence. The vector contains
the polyadenylation site of β-casein along with
approximately 100 base pairs of 5' flanking DNA. The 100
base pairs of 5' flanking DNA is attached to the bovine
α-lactalbumin 5' flanking region at the -100 position.
The construct uses the proximal promoter elements of
β-casein and the distal control region elements of
α-lactalbumin. The β-casein construct has been used to
produce transgenic mice as is illustrated in the
examples.

To understand the control of the
control/enhancer region, the 2.0 kilobases of 5' flanking
region were sequenced. A single strand copy of the
sequence is listed in Fig. 5 (SEQ ID NO: 3). The sequence
is listed 5' to 3' with the signal peptide coding region
underlined.

Regulatory Sequences

Potential regulatory sequences contained
within the 5' flanking region of bovine α-lactalbumin
have been identified. There are possible regulatory
regions in the introns as well as in the 3' flanking
region. Portions of the suspected control regions were
examined for possible sequence differences in the
population which might be related to milk and milk
protein production of individual cows. The differences
in the regulatory regions of α-lactalbumin are expected
to lead to differences in expression of α-lactalbumin
mRNA. The increased cellular content of mRNA will
increase the expression of α-lactalbumin protein with a
concomitant increase in lactose synthase resulting,
ultimately, in a milk and milk protein production
increase. This type of mechanism would be considered a
major gene effect on milk and milk protein production by
α-lactalbumin. The changes are viewed as causally linked
to changes in milk and milk protein production and not
correlatively linked. Correlatively linked traits are
those which are closely associated with an unknown
genetic locus which has the direct impact on the
quantitative trait.

Sequence differences between the U.S.
Holstein and the French cow (Vilotte et al., 1987) of an
unknown breed were found at four positions within the 5'
flanking region. One of the identified sequences has a
sequence which would indicate that it was a steroid
hormone response element. Two other differences were
noted in the RNA polymerase binding region and a fourth
in the signal peptide coding region of the gene. Because
of the relationship between these sequences and known
control sequences of mammalian genes, all the variations
occur in regions one would expect to be involved in
regulation of the amount of mRNA produced. Further,
genetic variations which occur in factors binding to
these regions would also be expected to cause changes.

Fig. 7 illustrates sequence variants observed
in the 5' flanking region between the present invention
U.S. bovine, human (Hall et al., 1987) and the French
bovine (Vilotte, 1987) for the putative steroid response
element (SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6
respectively) and for the RNA polymerase binding region
(SEQ ID NO: 7, SEQ ID NO: 8, and SEQ ID NO: 9 respectively).
All of the differences occur in highly conserved portions
of the gene as seen by comparing this region to the same
region of the human α-lactalbumin gene. Fig. 7 also
shows that the positions where the bovine genes differ
are the same positions the human gene differs from the
bovine. These data indicate that the bases are part of a
potentially important control region.

A method has been devised to give a clear-cut
differentiation between two of the variants at a position
-13 bases from the start of transcription, i.e., 13 base
positions from the signal peptide coding region. The two
variants are termed α-lac (-13) A and α-lac (-13) B.
The α-lac (-13) A genotype is an adenine base at position
-13; the α-lac (-13) B genotype is either a guanine,
thymine or cytosine base at -13. They can be
differentiated with a simple restriction enzyme digest of
an amplified polymerase chain reaction (PCR) product
using a specific restriction enzyme (Mnl I). Because of
the specificity of the restriction enzyme Mnl I, the
restriction analysis is unable to distinguish between
the three possible bases of the B genotype. The
α-lac (-13) A allele
contains an extra Mnl I site at position -13, giving the
smaller band observed on the gel.
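The band-based differentiation described above can be sketched in code. This is a minimal illustration only: the function name `call_genotype`, the band sizes and the tolerance are hypothetical and are not taken from the specification. The only rule drawn from the text is that the A allele carries the extra Mnl I site and therefore gives the smaller band.

```python
def call_genotype(observed_bands, b_band, a_band, tolerance=5):
    """Call the alpha-lac (-13) genotype from diagnostic band sizes (bp).

    b_band: size of the diagnostic band when the -13 site is absent (B allele).
    a_band: size of the smaller band produced by the extra site (A allele).
    Both sizes are hypothetical inputs supplied by the user.
    """
    has_b = any(abs(size - b_band) <= tolerance for size in observed_bands)
    has_a = any(abs(size - a_band) <= tolerance for size in observed_bands)
    if has_a and has_b:
        return "A/B"      # heterozygous: both bands present
    if has_a:
        return "A/A"      # homozygous for the extra site
    if has_b:
        return "B/B"      # homozygous without the extra site
    return "no call"      # neither diagnostic band was seen


# Hypothetical band sizes, for illustration only.
print(call_genotype([120, 100], b_band=120, a_band=100))
print(call_genotype([119], b_band=120, a_band=100))
```

The tolerance parameter is a design choice to absorb small sizing errors on a gel; a real protocol would calibrate it against a size ladder.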

To amplify the appropriate region of DNA,
oligonucleotides which frame the sequence of interest
were synthesized. These oligonucleotides were chosen
because of their specific chemical characteristics.
These oligonucleotides were then used in a polymerase
chain reaction to amplify the framed portion of the
α-lactalbumin gene. The oligonucleotides have the
following sequences:
α-lac Seq. 1 (SEQ ID NO: 10)

5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac Seq. 2 (SEQ ID NO: 11)

5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

Restriction fragment analysis (Sambrook, J.
et al., 1989) was used to examine animals from a number
of breeds of cattle. In most breeds, namely, Jersey,
Guernsey, Brown Swiss, Simmental and Brahman, only one of
two genotypes is found. This is the α-lac (-13) B
genotype. However, in the most popular and highest milk
producing breed of cattle, the Holstein, two genotypes
occur at this position. The frequency of the A genotype
was 27% in random samples, while the frequency of the B
genotype was 73%. Holsteins contain both the genotype
found in the other breeds as well as a separate distinct
genotype which appears to have arisen within the last
thirty years in the U.S. Holstein population as
determined by examining pedigrees of sires currently in
use. It appears that this genotype has unknowingly been
selected for using traditional animal selection.
Homozygous and heterozygous animals are found within the
Holstein population.
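Frequencies such as those quoted above are usually derived from counts of genotyped animals. The following is a minimal sketch of that arithmetic, assuming the percentages are read as allele frequencies (the text does not say how they were computed). The function name and the genotype counts are hypothetical and are not data from this specification.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_A, freq_B) from counts of A/A, A/B and B/B animals.

    Each animal carries two alleles, so an A/A animal contributes two
    A alleles and an A/B animal contributes one.
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("at least one genotyped animal is required")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1.0 - freq_a


# Hypothetical counts, for illustration only.
freq_a, freq_b = allele_frequencies(n_aa=10, n_ab=20, n_bb=70)
print(f"A: {freq_a:.2f}  B: {freq_b:.2f}")
```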

The genotype (α-lac (-13)) has been examined
for its correlation with milk and milk protein
production. The three additional variations are being
examined to determine the frequency of their differences
in the cattle population and their correlation with milk
and milk protein production. The possible linkage of
these genotypes is also being examined using DNA
sequencing. The goal of this technology is to identify
the optimal regulatory genotype for α-lactalbumin and to
select animals with those particular characteristics.

Detection and Selection of Four Genetic Variants 

The region of sequence where the α-lac (-13)
variation occurs can be amplified using the polymerase
chain reaction (PCR) (Sambrook et al., 1989) and two of
the following primers which were developed. Each primer
allows for amplification of a specific portion of the
α-lactalbumin gene. Combinations of the listed primers
can be used to amplify the region between any two of the
primer locations listed below.

Primer No.  (SEQ ID NO:)  Primer sequence                    Primer location
                                                             (from translation
                                                             start site)

1           (12)          5' CTCTTCCTGGATGTAAGGCTT 3'        (-120) - (-100)
2           (13)          5' TCCTGGGTGGTCATTGAAAGGACT 3'     (-2000) - (-1975)
3           (14)          5' CAATGTGGTATCTGGCTATTTAGTG 3'    (-717) - (-692)
4           (15)          5' AGCCTGGGTGGCATGGAATA 3'         (+53) - (+33)
5           (16)          5' GAAACGCGGTACAGACCCCT 3'         (+453) - (+433)
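The size of the product expected from a given primer pair can be estimated from the listed locations. The sketch below is illustrative only. It assumes that the first coordinate listed for each primer is its outer (5') end and that the numbering has no position 0 (so -1 is followed directly by +1); neither convention is stated in the text. The names `PRIMER_OUTER_END` and `amplicon_length` are hypothetical.

```python
# Outer (5') coordinate of each primer, taken as the first value in the
# "Primer location" column of the table above.
PRIMER_OUTER_END = {1: -120, 2: -2000, 3: -717, 4: 53, 5: 453}


def amplicon_length(forward_primer, reverse_primer):
    """Approximate product length (bp) between the outer ends of two primers.

    Assumes a coordinate system without a position 0, so a span that
    crosses from negative to positive coordinates is one base shorter
    than plain subtraction plus one would suggest.
    """
    start = PRIMER_OUTER_END[forward_primer]
    end = PRIMER_OUTER_END[reverse_primer]
    if start > end:
        start, end = end, start
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # no position 0 under the stated assumption
    return length


print(amplicon_length(3, 4))
print(amplicon_length(1, 4))
```

Under these assumptions, primers 3 and 4 span 770 positions, which agrees with the 770 base pair product mentioned elsewhere in the description.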

After amplification of the specific region,
the DNA is either sequenced or digested with restriction
enzymes to detect the sequence differences. In the case
of the α-lac (-13) variation, the sequence difference can
be seen using the restriction enzyme Mnl I (5' CTCC 3'
recognition site). The PCR DNA product is digested with
Mnl I and then run on a 4% NuSieve agarose gel to observe
the polymorphism.
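The effect of a single-base difference on a digest pattern can be illustrated with a simplified in-silico digest. This is a sketch under stated assumptions, not the laboratory procedure: it searches one strand only and cuts at the first base of each occurrence of the motif, whereas a real enzyme may cut at an offset from its site and recognise either strand. The toy sequences and the function name `digest` are hypothetical; the motif is passed in as a parameter and the value used here is the recognition site as quoted in the text.

```python
def digest(sequence, site):
    """Return fragment lengths after cutting at the first base of every
    occurrence of `site` (simplified: one strand, no cut offset)."""
    cut_points = []
    start = sequence.find(site)
    while start != -1:
        if start > 0:  # a site at position 0 creates no extra fragment
            cut_points.append(start)
        start = sequence.find(site, start + 1)
    bounds = [0] + cut_points + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequences (hypothetical): they differ at a single base, which
# creates or removes one occurrence of the motif.
with_site = "AAAACTCCGGGGGGCTCCTT"
without_site = "AAAACTCCGGGGGGCTACTT"

print(digest(with_site, "CTCC"))
print(digest(without_site, "CTCC"))
```

The sequence carrying the extra site yields an additional, smaller fragment, which is the kind of difference the gel is used to reveal.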

A 650 base pair sequence containing all four
of the variations is being examined using a unique
sequencing technique. PCR is initially used to amplify a
770 base pair portion of the α-lactalbumin 5' flanking
region. Another PCR reaction is then performed using a
portion of the initial reaction and the following primers
(SEQ ID NO: 10 and SEQ ID NO: 11 respectively):

α-lac Seq. 1  5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac Seq. 2  5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

The primers listed above contain a portion of
the α-lactalbumin gene as well as both M13 DNA sequencing
primers. The primers are designed to allow for DNA
sequencing to be performed in both directions on the PCR
DNA product. The final PCR product will contain the
region of α-lactalbumin containing the four genetic
variants, the two M13 sequencing priming regions and 5
"dummy bases" on the end to aid in the M13 primer
binding.
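The three-part layout of these primers (dummy bases, M13 priming region, gene-specific portion) can be made visible with a short sketch. The two M13 sequences used below are the commonly quoted 18-base universal forward (-21) and reverse sequencing primers; they are not given in this specification and should be treated as an assumption. The function name `split_tailed_primer` is hypothetical.

```python
# Commonly quoted universal M13 sequencing primers (assumed, not from the text).
M13_FORWARD = "TGTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGACC"

# Primer sequences as listed in the description (SEQ ID NO: 10 and 11).
ALPHA_LAC_SEQ_1 = "ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT"
ALPHA_LAC_SEQ_2 = "AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT"


def split_tailed_primer(primer, m13):
    """Split a tailed primer into (dummy bases, M13 region, gene-specific part)."""
    index = primer.find(m13)
    if index == -1:
        raise ValueError("M13 priming sequence not found in primer")
    return primer[:index], m13, primer[index + len(m13):]


for name, primer, m13 in (
    ("Seq. 1", ALPHA_LAC_SEQ_1, M13_FORWARD),
    ("Seq. 2", ALPHA_LAC_SEQ_2, M13_REVERSE),
):
    dummy, tail, gene_part = split_tailed_primer(primer, m13)
    print(name, len(dummy), len(tail), gene_part)
```

With these assumed M13 sequences, each primer splits into 5 leading bases, an 18-base M13 region and a 20-base gene-specific portion, which is consistent with the 5 "dummy bases" mentioned above.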

Comparison of Highly Conserved Portions of the 5'
Flanking Region of α-Lactalbumin Between Species

Reference is made to Figs. 8-10 for
DOTPLOT™ graphs comparing the bovine α-lactalbumin
sequence to the same region of the human (Fig. 8), guinea
pig (Fig. 9), and rat (Fig. 10). The region in Fig. 8
(human) spans 819 base pairs. The sequences are highly
conserved to about 700 base pairs. The region in Fig. 9
(guinea pig) spans 1381 base pairs. The sequences are
highly conserved to about 700 base pairs, but then
diverge. The region in Fig. 10 (rat) spans 1337 base
pairs. The sequences are highly conserved to about 700
base pairs, but then diverge. Species differences in
control regions would be expected to occur in
non-conserved regions of the sequence.

Comparison of 5' Flanking Region of Bovine α-Lactalbumin
to Other Bovine Milk Protein Genes

Portions of the 5' flanking region of the
other bovine milk protein genes (αs1- and αs2-casein,
β-casein, κ-casein and β-lactoglobulin) which are highly
conserved with the α-lactalbumin 5' flanking region were
identified. It is probable that sequence differences
within these regions will also have an effect on mRNA
production as well as final protein production. Two
examples of these highly homologous regions are listed
below.

The bovine α-lactalbumin sequence from (-161)
- (-115) (SEQ ID NO: 17) is compared to the bovine β-casein
sequence (SEQ ID NO: 18) corresponding to the same region
of the gene. Percent similarity is 69% over 46 bases.

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA
|||| ||| | | |||| || | |  | ||||| | || ||  |||
AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA
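A similarity figure of this kind can be checked with a simple column-by-column count. The sketch below computes percent identity over the aligned columns that contain a base in both sequences, skipping gap columns; whether the original figure was computed this way is an assumption, and the function name `percent_identity` is hypothetical. The two strings are the aligned sequences shown above, with "." marking the gap.

```python
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percent of identical bases over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a in gap_chars or base_b in gap_chars:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


alpha_lac = "AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA"
beta_casein = "AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA"

print(round(percent_identity(alpha_lac, beta_casein)))
```

For this pair the identity over non-gap columns rounds to 69, in line with the percentage stated above.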

The bovine α-lactalbumin sequence (SEQ ID
NO: 19) from (-1420) - (-1351) is compared to the bovine
β-casein sequence (SEQ ID NO: 20) corresponding to the
same region of the gene. Percent similarity is 75% over
69 bases.

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCA
TCTCAGAAATCACACTTTTTTGCCTGTG . GCCTTGGCA

TACCAGAAGCTAACAGCTA
.ACCAAAAGCTAACACATA

The included data indicate that the bovine
α-lactalbumin gene will be useful as a selection tool in
the dairy cattle industry as well as a valuable
control/enhancer and gene to be used in the field of
genetically engineered mammals. The control region we
have cloned contains the necessary regulatory elements to
express genes in the milk of genetically engineered
mammals as well as the "high expressing genotype" as
shown by our milk and milk protein production and
sequence variation data. These facts make this a useful
gene in both industrial and research areas. Application
of these techniques to the other milk proteins will allow
for the selection of valuable genotypes corresponding to
the β-casein, αs1- and αs2-casein and κ-casein genes and
the β-lactoglobulin genes.
Coding Region 

The coding region of the α-lactalbumin
protein includes a 1.7 kilobase sequence.
3' Flanking Region

The 3' flanking region is an 8.8 kilobase
flanking region downstream of the DNA sequence coding for
the desired recombinant protein. This region apparently
stabilizes the RNA transcript of the expression system
and thus increases the yield of desired protein from the
expression system.
Operation 

The above-described expression systems may be
prepared by methods well-known in the art. Examples
include various ligation techniques employing
conventional linkers, restriction sites, etc.
Preferably, these expression systems are part of larger
plasmids.

After isolation and purification, the 
expression systems or constructs are added to the gene 
pool which is to be genetically altered. 

The methods for genetically engineering
mammals are well-known to the art. Reference is made
to Alberts, B. et al., 1989 and Lewin, B. 1990, for
textbook descriptions of genetic engineering and
transgenic alteration of animals. Briefly, genetic
engineering involves the construction of expression
vectors so that a cDNA clone or genomic structure is
connected directly to a DNA sequence that acts as a
strong promoter for DNA transcription. By means of
genetic engineering, mammalian cells, such as mammary
tissue, can be induced to make vast quantities of useful
proteins.

For the purposes of this invention, the term
"genetic engineering," as defined supra in the list of
definitions, includes single line alteration, i.e.,
genetic alteration only during the life of the affected
animal with no germ line permanence. The construct can
be genetically incorporated in mammalian glands such as
mammary glands and mammalian stem cells.

Genetic engineering also includes transgenic
alteration, i.e., the permanent insertion of the gene
sequence into the genomic structure of the affected
animal and any offspring. Transgenically altering a
mammal involves microinjecting a DNA construct into the
pronuclei of the fertilized mammalian egg to cause one or
more copies of the construct to be retained in the cells
of the developing mammal. In a transgenic animal, the
engineered genes are permanently inserted into the germ
line of the animal.

The genetically engineered mammal is then
characterized by an expression system comprising the
α-lactalbumin control region operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the recombinant
protein in mammary tissue. In order to produce and
secrete the recombinant protein into the mammal's milk,
the transgenic mammal must be allowed to produce the
milk, after which the milk is collected. The milk may
then be used in standard manufacturing processes. The
exogenous recombinant protein may also be isolated from
the milk according to methods known to the art.
Selection Characteristics 

The a-lactalbumin control/enhancer sequence 
of Fig. 1 is also important as a selection characteristic 
for identifying superior or elite milk producing mammals. 
Presently, those in the dairy cattle business can only 
rely on pedigree information, which is frequently not 
available, to predict milk and milk protein production in 
mammals, specifically the bovine species. The study of 
physiological markers as a means for determining milk and 
milk protein production has received some interest. The 
most common physiological marker traits studied in dairy 
cattle are hormones, enzymes, and different blood 
metabolites. Components of the immune system have also 
been studied. Traits listed as possible marker traits 
for milk yield include thyroxine, blood urea nitrogen, 
growth hormones, insulin-like growth factors and insulin, 
and glucose and free fatty acids. While these techniques 
have shown some advances in predicting milk and milk 
protein production in a dairy animal, there is currently 
no other reliable means to predict these characteristics. 

The present invention provides a selection 
characteristic for identifying superior milk and milk 
protein-producing mammals comprising inherited genetic 
material which is DNA occurring in the genetic structure 
of the mammal in which the genetic material encodes a 
dominant selectable marker for bovine a-lactalbumin. 

The DNA sequence disclosed herein serves as a 
characteristic marker for elite milk producing mammals. 

The examples below describe the invention 
disclosed herein, although the invention is not to be 
understood as limited in any way to the terms and scope 
of the examples. 

EXAMPLES 

Example 1: a-lac (-13) variation study. 

Forty-two mammals were selected in a 
stratified random manner to provide mammals of a wide 
range of milk and milk protein production capabilities 
within the UW herd. 

DNA was isolated according to procedures 
known to the art from a random sample of 42 Holstein 
dairy cows in the University of Wisconsin-Madison herd. 
Each mammal was genotyped as described previously for the 
a-lactalbumin (-13) variation using a 4% NuSieve gel of 
MnlI-digested PCR products. 

The gene frequency in this population is 28% 
for the a-lac (-13) A and 72% for the a-lac (-13) B. 
Each of the distinct genotypes is shown on the gel in 
Fig. 12. The legend for the gel of Figure 12 is as 
follows: 

Lane 1      Molecular Weight Standards 
Lanes 2-3   heterozygous a-lac (-13) AB 
Lane 4      homozygous a-lac (-13) BB 
Lane 5      heterozygous a-lac (-13) AB 
Lane 6      homozygous a-lac (-13) BB 
Lane 7      homozygous a-lac (-13) AA 
Lane 8      heterozygous a-lac (-13) AB 
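
The allele frequencies reported for this sample (28% A and 72% B among 42 cows) can be related to genotype counts with a short calculation. The sketch below is illustrative only: it assumes Hardy-Weinberg proportions, which the text does not assert, and the function names are hypothetical.

```python
# Expected genotype counts under Hardy-Weinberg proportions.
# Assumption: random mating; the source reports allele frequencies only.

def hardy_weinberg_counts(freq_a: float, n_animals: int) -> dict:
    """Return expected AA, AB and BB counts for allele A frequency freq_a."""
    freq_b = 1.0 - freq_a
    return {
        "AA": freq_a ** 2 * n_animals,
        "AB": 2 * freq_a * freq_b * n_animals,
        "BB": freq_b ** 2 * n_animals,
    }


def allele_frequency_a(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Frequency of allele A computed from observed genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    return (2 * n_aa + n_ab) / total_alleles


# Values taken from the text: 28% A allele, 42 animals.
expected = hardy_weinberg_counts(freq_a=0.28, n_animals=42)
for genotype, count in expected.items():
    print(f"{genotype}: {count:.1f}")
```

Going the other way, `allele_frequency_a` recovers the allele frequency once the genotypes read from a gel such as Fig. 12 have been counted.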

Analysis of the genetic capabilities of the 
42 mammals indicates a possible major gene effect caused 
by the a-lac (-13) allele or linked to the a-lac (-13) 
allele. A scatter plot of each data point as well as 
mean values for each of the three genotypes is 
illustrated in Fig. 13. Holstein cows were compared 
using their predicted transmitting ability for milk. 

The data indicate that the a-lac (-13) A 
genotype is the preferred genotype for milk and milk 
protein production. Table 1 shown below indicates the 
statistical association of differences in milk and milk 
protein production ability observed between each of the 
genotypes for the traits listed below. Analysis of 
variance and T tests (LSD) were performed on the data. 
All of the production yield traits were positively 
correlated with the a-lac (-13) A allele. Milk protein 
percentage was negatively correlated to the a-lac (-13) A 
allele. 

Table 1 

Trait / Genotype        Genotype 
                        a-lac (-13) AA   a-lac (-13) AB   a-lac (-13) BB 

PTA (Milk) / AA         -                N.S.             p<0.02 
PTA (Milk) / AB         N.S.             -                p<0.02 
ME305 (Milk) / AA       -                N.S.             N.S. 
ME305 (Milk) / AB       N.S.             -                p<0.1 
PTA (Protein #) / AA    -                N.S.             N.S. 
PTA (Protein #) / AB    N.S.             -                p<0.1 
PTA (Protein %) / AA    -                N.S.             p<0.01 
PTA (Protein %) / AB    N.S.             -                p<0.01 
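
The analysis of variance and LSD t tests mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the trait values are invented placeholders, and the function names are hypothetical. It computes the one-way ANOVA F statistic across genotypes and a Fisher LSD t statistic for one pair of genotypes using the pooled error term.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def lsd_t(group_x, group_y, ms_within):
    """Fisher LSD t statistic for one pair, using the pooled ANOVA error."""
    se = (ms_within * (1 / len(group_x) + 1 / len(group_y))) ** 0.5
    return (mean(group_x) - mean(group_y)) / se


# Hypothetical trait values per genotype, invented purely for illustration.
trait_by_genotype = {
    "AA": [5.0, 6.0, 7.0],
    "AB": [2.0, 3.0, 4.0],
    "BB": [1.0, 2.0, 3.0],
}

f_stat, df_b, df_w, ms_w = one_way_anova(list(trait_by_genotype.values()))
t_aa_bb = lsd_t(trait_by_genotype["AA"], trait_by_genotype["BB"], ms_w)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, LSD t (AA vs BB) = {t_aa_bb:.2f}")
```

The p values reported in Table 1 would follow from comparing such statistics with the F and t distributions at the stated degrees of freedom.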

Example 2: Production of transgenic mice to study the 
regulation of bovine a-lactalbumin gene expression. 

Genomic Library Screening: 

The gene encoding the milk protein bovine 
a-lactalbumin was isolated from a bovine genomic library 
(Woychik, 1982). The genomic library was screened 
according to the following procedure. Approximately 1.5 
million lambda plaques were transferred to nylon 
membranes using procedures described by Maniatis et al. 
(1989). The a-lactalbumin cDNA (Hurley, 1987) or a 770 
base pair PCR product was nick translated (BRL) with 
a-32P labeled dCTP. Blots were prehybridized overnight 
(65°C) then hybridized for 16 hours at 65°C. Blots were 
washed (twice in 2X SSC, 1% SDS; once in 0.1X SSC, 0.1% 
SDS) at 65°C and placed on Kodak X-OMAT film for 
autoradiography. An 8.0 kilobase fragment containing the 
a-lactalbumin gene was purified as illustrated in Fig. 4. 
The 8.0 kilobase fragment contained 2.1 kilobases of 5' 
flanking region, the 1.7 kilobase coding region and 2.6 
kilobases of 3' flanking region. 

Production of transgenic mice: 

Mature C57B6 X DBA2J F1 (B6D2) females were 
superovulated (PMSG and hCG) and mated with ICR or B6D2 
males to yield fertilized eggs for pronuclear 
microinjection. The eggs were microinjected using a 
Leitz micromanipulator and a Nikon inverted microscope. 
Forty normal-appearing two-cell embryos were transferred 
to each pseudopregnant recipient (University of 
Wisconsin-Madison Biotechnology Center Transgenic Mouse 
Facility, Dr. Jan Heideman). 

Screening of transgenic mice using PCR: 

Tail DNA was extracted using the method 
described by Constantini et al. (1986). Polymerase chain 
reaction (PCR) was performed using 10 µl 10x PCR reaction 
buffer (Promega Corp., Madison, WI), 200 µM each dNTP 
(Pharmacia Intl., Milwaukee, WI), 1.0 µM each primer 
(upstream primer 25mer -712 to -687 (5' 
CAATGTGGTATCTGGCTATTTAGTG 3') (SEQ ID NO: 14), downstream 
primer 20mer +39 to +59 (5' AGCCTGGGTGGCATGGAATA 3') (SEQ 
ID NO: 15)), 1 unit Taq DNA polymerase (Promega Corp., 
Madison, WI) and 1 µg genomic DNA. Volume was adjusted 
to 100 µl with double distilled sterile water and the 
reaction was overlaid with heavy mineral oil. Samples 
were subjected to 30 cycles (94°C 2 min., 50°C 1.5 min., 
72°C 1.5 min.). Products were run in a 1% agarose gel 
and stained with ethidium bromide. 
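
As a sanity check on the screening primers, their binding sites can be located in the 5' flanking sequence (SEQ ID NO: 3). The sketch below is illustrative: the two template excerpts are copied from the sequence listing (positions 1261-1300 and 1981-2044), while the helper names are hypothetical. Under these assumptions the primers span 769 base pairs, which is close to the 770 base pair PCR product mentioned for the library screen.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(excerpt: str, excerpt_start: int, site: str):
    """Return 1-based (start, end) of site within a numbered excerpt, or None."""
    index = excerpt.find(site)
    if index < 0:
        return None
    start = excerpt_start + index
    return start, start + len(site) - 1


# Excerpts of SEQ ID NO: 3 as printed in the sequence listing.
UPSTREAM_EXCERPT_START = 1261
UPSTREAM_EXCERPT = "GATTTACAATGTGGTATCTGGCTATTTAGTGGTATTGGTG"
DOWNSTREAM_EXCERPT_START = 1981
DOWNSTREAM_EXCERPT = (
    "GATGTCCTTTGTCTCTCTGCTCCTGGTAGG"
    "CATCCTATTCCATGCCACCCAGGCTGAACAGTTA"
)

UPSTREAM_PRIMER = "CAATGTGGTATCTGGCTATTTAGTG"  # SEQ ID NO: 14
DOWNSTREAM_PRIMER = "AGCCTGGGTGGCATGGAATA"     # SEQ ID NO: 15

# The upstream primer matches the listed strand directly; the downstream
# primer anneals to the opposite strand, so its reverse complement is searched.
forward_site = locate(UPSTREAM_EXCERPT, UPSTREAM_EXCERPT_START, UPSTREAM_PRIMER)
reverse_site = locate(
    DOWNSTREAM_EXCERPT,
    DOWNSTREAM_EXCERPT_START,
    reverse_complement(DOWNSTREAM_PRIMER),
)
amplicon_length = reverse_site[1] - forward_site[0] + 1
print(forward_site, reverse_site, amplicon_length)
```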

Mouse Milking: 

The mice were separated from their litters 
for four hours and then anesthetized (0.01 ml/g body 
weight I.P. injection of 36% propylene glycol, 10.5% 
ethyl alcohol (95%), 41.5% sterile water, and 12% sodium 
pentobarbital (50 mg/ml)). After being anesthetized the 
mice were injected I.M. with 0.3 I.U. oxytocin and milked 
using a small vacuum milking machine. Three of fifty-one 
live offspring were identified as being transgenic using 
polymerase chain reaction. Reference is made to Fig. 14 
for a graph illustrating expression levels observed in 
each of the 3 a-lactalbumin transgenic mouse lines. 
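
The anesthetic recipe above implies a dose per unit body weight that is easy to work out. The sketch below is arithmetic only and rests on one assumption not stated in the text, namely that the listed percentages are volume fractions of the final mixture; the 30 g body weight and the function name are hypothetical.

```python
def anesthetic_dose(body_weight_g: float,
                    injection_ml_per_g: float = 0.01,
                    stock_mg_per_ml: float = 50.0,
                    stock_fraction: float = 0.12):
    """Return (injection volume in ml, pentobarbital dose in mg).

    Assumes stock_fraction is the volume fraction of the 50 mg/ml stock
    in the final mixture (an assumption, not stated in the source).
    """
    volume_ml = body_weight_g * injection_ml_per_g
    final_mg_per_ml = stock_mg_per_ml * stock_fraction
    return volume_ml, volume_ml * final_mg_per_ml


# Hypothetical 30 g mouse.
volume_ml, dose_mg = anesthetic_dose(30.0)
dose_mg_per_kg = dose_mg / 30.0 * 1000.0
print(f"{volume_ml:.2f} ml, {dose_mg:.2f} mg, {dose_mg_per_kg:.0f} mg/kg")
```

Because both the injection volume and the dose scale with body weight, the dose per kilogram is the same for any animal under these assumptions.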

ELISA: 

Second generation mammals from one line were 
milked and analysis was performed using an ELISA (enzyme 
linked immunosorbent assay) for bovine a-lactalbumin 
according to the following procedure: 

1. Coat 1/40k bovine a-lactalbumin 
antiserum, 100 µl per well (in 0.05M carbonate buffer, pH 
9.6), on Nunc-Immuno Plate IF MaxiSorp. 

2. Wash 4x with wash buffer (0.025% Tween 
20 in PBS pH 7.2). 

3. Add 50 µl assay buffer (0.04M MOPS, 
0.12M NaCl, 0.01M EDTA, 0.1% gelatin, 0.05% Tween 20, 
0.005% chlorhexidine digluconate, Leupeptin 50 mg/ml, pH 
7.4). 

4. Add 50 µl of standards and samples (in 
assay buffer) in triplicate. 

5. Add 50 µl 1/100k diluted a-lactalbumin 
biotin conjugate. 

6. Incubate overnight at 4°C. 

7. Wash 4x with wash buffer. 

8. Add 100 µl 1/10k assay buffer diluted 
ExtrAvidin-peroxidase (Sigma). Incubate 2 hours at RT. 

9. Wash 4x twice with wash buffer. 

10. Add 125 µl fresh substrate buffer (200 
µl tetramethylbenzidine (20 mg/ml in DMSO), 64 µl 0.5M 
hydrogen peroxide, 19.74 ml sodium acetate, pH 4.8). 

11. Incubate for 12 minutes at RT. 

12. Add 50 µl 0.5M sulfuric acid to stop 
substrate reaction. 

13. Read absorbance at 450 nm minus 600 nm 
in an EIA autoreader. 
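
The readout in step 13 can be turned into a concentration roughly as sketched below. This is an illustration under stated assumptions rather than the authors' calculation: the standard curve values and well readings are invented, the function names are hypothetical, and simple linear interpolation between neighbouring standards stands in for whatever curve fit was actually used.

```python
from statistics import mean


def corrected_absorbance(a450: float, a600: float) -> float:
    """Absorbance at 450 nm minus the 600 nm reference reading."""
    return a450 - a600


def interpolate_concentration(signal: float, standards) -> float:
    """Linearly interpolate a concentration from (concentration, signal) pairs.

    Works whether the signal rises or falls with concentration, provided the
    standards have distinct signals.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            fraction = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("signal outside the range of the standards")


# Hypothetical standards: (concentration, mean corrected absorbance).
standards = [(0.0, 1.20), (10.0, 0.80), (100.0, 0.40)]

# Hypothetical triplicate wells for one sample: (A450, A600).
wells = [(0.65, 0.05), (0.66, 0.06), (0.64, 0.04)]
sample_signal = mean(corrected_absorbance(a450, a600) for a450, a600 in wells)
sample_concentration = interpolate_concentration(sample_signal, standards)
print(f"{sample_signal:.2f} -> {sample_concentration:.1f}")
```

Averaging the triplicate wells before interpolation mirrors step 4, where standards and samples are run in triplicate.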

Bovine a-lactalbumin was present at 
concentrations up to and beyond 1.0 mg/ml mouse 
milk. Expression was determined by Western blotting in 
the following steps. The 14% PAGE gel was transferred to an 
Immobilon-P membrane (Millipore), which was blocked in 
0.02M sodium phosphate, 0.12M NaCl, 0.01% gelatin, 0.05% 
Tween 20, pH 7.2, and incubated with anti-bovine 
a-lactalbumin (1/2000 dilution) for 2 hours at room 
temperature. The membrane was washed twice (2 min.) with an 
ELISA wash buffer and incubated with goat anti-rabbit 
IgG-HRP for 2 hours at room temperature, followed by 
washing 3 times with a wash buffer and washing once with 
double-distilled water. The membrane was placed in a substrate 
solution (25 mg 3,3'-diaminobenzidine, 1 ml 1% CoCl2 in 
H2O, 49 ml PBS pH 7.4 and 0.05 ml 30% H2O2) and monitored 
for color development. The membrane was air dried. 

It is understood that the invention is not 
confined to the particular constructions and arrangements 
herein illustrated and described, but embraces such 
modified forms thereof as come within the scope of the 
claims following the bibliography. 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: BLECK, GREGORY T. 
BREMEL, ROBERT D. 

(ii) TITLE OF INVENTION: DNA SEQUENCE ENCODING BOVINE 
ALPHA-LACTALBUMIN AND METHODS OF USE 

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ANDRUS, SCEALES, STARKE & SAWALL 

(B) STREET: 100 E. WISCONSIN AVE., SUITE 1100 

(C) CITY: MILWAUKEE 

(D) STATE: WI 

(E) COUNTRY: USA 

(F) ZIP: 53202-4178 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Sara, Charles S. 

(B) REGISTRATION NUMBER: 30,492 

(C) REFERENCE/DOCKET NUMBER: F-3262-1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (608) 255-2022 

(B) TELEFAX: (608) 255-2182 

(C) TELEX: 26832 ANDSTARK 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
ATGACCATGA TTACGAATTC ATCGTA 
(2) INFORMATION FOR SEQ ID NO: 2: 

(1) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GAACAGTTAT CTAGATCTCG AGCTCGCGAA AGCTTGCATG CCTGCAGGTC GACTCTAGAG 60 
GATCCCCGGG TACCGAGCTC GAATTCAC 88 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2044 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: signal peptide coding region 

(B) LOCATION: 1943..2043 

(ix) FEATURE: 

(A) NAME/KEY: inherited control region for a-lactalbumin 

(B) LOCATION: 1966 

(ix) FEATURE: 

(A) NAME/KEY: putative steroid response element 

(B) LOCATION: 1433..1446 

(ix) FEATURE: 

(A) NAME/KEY: RNA polymerase binding region 

(B) LOCATION: 1961..1978 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GATCAGTCCT GGGTGGTCAT TGAAAGGACT GATGCTGAAG TTGAAGCTCC AATACTTTGG 60 

CCACCTGATG CGAAGAACTG ACTCATGTGA TAAGACCCTG ATACTGGGAA AGATTGAAGG 120 

CAGGAGGAGA AGGGATGACA GAGGATGGAA GAGTTGGATG GAATCACCAA CTCGATGGAC 180 

ATGAGTTTGA GCAAGCTTCC AGGAGTTGGT AATGGGCAGG GAAGCCTGGC GTGCTGCAGT 240 

CCATGGGGTT GCAAAGAGTT GGACACTACT GAGTGACTGA ACTGAACTGA TAGTGTAATC 300 

CATGGTACAG AATATAGGAT 2UVAA2UV6AGG AAGAGTTTGC CCTGATTCTG AAGAGTTGTA 360 

GGATATAAAA GTTTAGAATA CCTTTAGTTT GGAAGTCTTA AATTATTTAC TTAGGATGGG 420 

TACCCACTGC AATATAAGAA ATCAGGCTTT AGAGACTGAT GTAGAGAGAA TGAGCCCTGG 480 

CATACCAGAA GCTAACAGCT ATTGGTTATA GCTGTTATAA CCAATATATA ACCAATATAT 540 

TGGTTATATA GCATGAAGCT TGATGCCAGC AATTTGAAGG AACCATTTAG AACTAGTATC 600 

CTAAACTCTA CATGTTCCAG GACACTGATC TTAAAGCTCA GGTTCAGAAT CTTGTTTTAT 660 

AGGCTCTAGG TGTATATTGT GGGGCTTCCC TGGTGGCTCA GATGGTAAAG TGTCTGCCTG 720 

CAATGTGGGT GATCTGGGTT CGATCCCTGG CTTGGGAAGA TCCCCTGGAG AAGGAAATGG 780 

CAACCCACTC TAGTACTCTT ACCTGGAAAA TTCCATGGAC AGAGGAGCCT TGTAAGCTAC 840 

AGTCCATGGG ATTGCAAAGA GTTGAACACA ACTGAGCAAC TAAGCACAGC ACAGTACAGT 900 

ATACACCTGT GAGGTGAAGT GAAGTGAAGG TTCAATGCAG GGTCTCCTGC ATTGCAGAAA 960 

GATTCTTXAC CATCTGAGCC ACCAGGGAAG CCCAAGAATA CTGGAGTGGG TAGCCXATTC 1020 

CTTCTCCAGG GGATCTTCCC ATCCCAGGAA TTGAACTGGA GTCTCCTGCA TTTCAGGTGG 1080 

ATTCTTCACC AGCTGAACTA CCAGGTGGAT ACTACTCCAA TATTAAAGTG CTTAAAGTCC 1140 

AGTTTTCCCA CCTTTCCCAA AAAGGTTGGG TCACTCTTTT TTAACCTTCT GTGGCCTACT 1200 

CTGAGGCTGT CTACAAGCTT ATATATTTAT GAACACATTT ATTGCAAGTT GTTAGTTTTA 1260 

GATTTACAAT GTGGTATCTG GCTATTTAGT GGTATTGGTG GTTGGGGATG GGGAGGCTGA 1320 

TAGCATCTCA GAGGGCAGCT AGATACTGTC ATACACACTT TTCAAGTTCT CCATTTTTGT 1380 

GAAATAGAAA GTCTCTGGAT CTAAGTTATA TGTGATTCTC AGTCTCTGTG GTCATATTCT 1440 

ATTCTACTCC TGACCACTCA ACAAGGAACC AAGATATCAA GGGACACTTG TTTTGTTTCA 1500 

TGCCTGGGTT GAGTGGGCCA TGACATATGA TGATGTACAG TCCTTTTCCA TATTCTGTAT 1560 

OTCTCTAAGA GGAAGGAGGA GTTGGCCGTG GACCCTTTGT GCATTTTCTG ATTGCTTCAC 1620 

TTGTATTACC CCTGAGGCCC CCTTTGTTCC TGAAATAGGT TGGGCACATC TTGCTTCCTA 1680 

GAACCAACAC TACCAGAAAC AACATAAATA AAGCCAAATG GGAAACAGGA TCATGTTTGT 1740 

AACACTCTTT GGGCAGGTAA CAATACCTAG TATGGACTAG AGATTCTGGG GAGGAAAGGA 1800 

AAAGTGGGGT GAAATTACTG AAGGAAGCTC AATGTTTCTT TGTTGGTTTT ACTGGCCTCT 1860 

CTTGTCATCC TCTTCCTGGA TGTAAGGCTT GATGCCAGGG CCCCTAAGGC TTTTTCCACA 1920 

AATAAAAGGA GGTGAGCAGT GTGGTGACCC CATTTCAGAA TCTTGAGGGG TAACCAAAAT 1980 

GATGTCCTTT GTCTCTCTGC TCCTGGTAGG CATCCTATTC CATGCCACCC AGGCTGAACA 2040 

GTTA 2044 
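
The feature locations given for SEQ ID NO: 3 can be checked against the sequence itself. The sketch below is illustrative and uses two short excerpts copied from the listing rather than the full 2044 bases; the dictionary keys and helper name are hypothetical. It extracts the base at position 1966 (the inherited control region feature), the 1961..1978 RNA polymerase binding region, and the 1433..1446 putative steroid response element.

```python
# Excerpts of SEQ ID NO: 3: name -> (1-based start position, bases).
EXCERPTS = {
    "promoter_proximal": (
        1921,
        "AATAAAAGGAGGTGAGCAGTGTGGTGACCCCATTTCAGAATCTTGAGGGGTAACCAAAAT",
    ),
    "steroid_region": (1431, "GTCATATTCTATTCTACTCC"),
}


def subsequence(excerpt_start: int, excerpt: str, first: int, last: int) -> str:
    """Return bases first..last (1-based, inclusive) from a numbered excerpt."""
    lo = first - excerpt_start
    hi = last - excerpt_start + 1
    if lo < 0 or hi > len(excerpt):
        raise ValueError("requested range lies outside the excerpt")
    return excerpt[lo:hi]


start, bases = EXCERPTS["promoter_proximal"]
variant_base = subsequence(start, bases, 1966, 1966)
polymerase_region = subsequence(start, bases, 1961, 1978)

start, bases = EXCERPTS["steroid_region"]
steroid_element = subsequence(start, bases, 1433, 1446)

print(variant_base, polymerase_region, steroid_element)
```

Under these assumptions the extracted 1961..1978 and 1433..1446 stretches match the sequences listed as SEQ ID NO: 7 and SEQ ID NO: 4, and position 1966 carries adenine in this listing.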

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CATATTCTAT TCTA 14 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 
CATATTCTAT TCCTA 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6 
CATATTCTAT TTCTA 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 
TCTTGAGGGG TAACCAAA 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8 
TCTTGGGGGT AGCCAAA 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9 



TCTTGGGGGG TCACCAAA 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ACGCXTGTAA AACGACGGCC AGTTGATTCT CAGTCTCTGT GGT 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AGCATCAGGA AACAGCTATG ACCTGGGTGG CATGGAATAG GAT 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CTCTTCCTGG ATGTAAGGCT T 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TCCTGGGTGG TCATTGAAAG GACT 
(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CAATGTGGTA TCTGGCTATT TAGTG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
AGCCTGGGTG GCATGGAATA 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GAAACGCGGT ACAGACCCCT 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
AGGAAGCTCA ATGTTTCTTT GTTGGTTTTA CTGGCCTCTC TTGTCA 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AGGAGGCTAT TCTTTCCTTT TAGTCTATAC TGTCTTCGCT CTTCA 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TATAAGAAAT CAGGCTTTAG AGACTGATGT AGAGAGAATG AGCCCTGGCA TACCAGAAGC 
TAACAGCTA 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TCTCAGAAAT CACACTTTTT TGCCTGTGGC CTTGGCAACC AAAAGCTAAC ACATA 

CLAIMS 
What is claimed is: 

1. A mammary specific DNA sequence encoding 
bovine a-lactalbumin and promoting quantitative 
differences in gene expression among mammals, wherein the 
DNA sequence is characterized by variations in the gene 
structure in the control region of bovine a-lactalbumin. 

2. The DNA sequence of claim 1 wherein one of 
the variations is in the -13 position of the DNA 
sequence. 

3. The DNA sequence of claim 2 wherein the -13 
position is occupied by adenine. 

4. The DNA sequence of claim 1 comprising the 
following DNA sequence (SEQ ID NO: 19) in the control 
region of bovine a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA 
AGCTAACAGCTA. 

5. The DNA sequence of claim 1 comprising the 
following DNA sequence (SEQ ID NO: 17) in the control 
region of bovine a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA. 

6. The DNA sequence of claim 1 comprising the 
DNA sequence listed in Fig. 5 (SEQ ID NO: 3) in the 
control region of bovine a-lactalbumin. 

7. An expression vector comprising the DNA 
sequence of claim 1. 

8. An expression system comprising a mammary 
specific a-lactalbumin control region construct which 
when genetically incorporated into a mammal permits the 
female species of that mammal to produce the desired 
recombinant protein in its milk. 

9. The expression system of claim 8 which 
comprises at least one a-lactalbumin control region 
construct operatively linked to a DNA sequence coding for 
a signal peptide and a-lactalbumin. 

10. The expression system of claim 8 which 
comprises a 3' flanking region downstream of the DNA 
sequence coding for a-lactalbumin. 

11. The expression system of claim 10 wherein the 
construct includes a 5' flanking region upstream of the 
DNA sequence coding for the signal peptide. 

12. The expression system of claim 8 wherein the 
construct comprises a 5' a-lactalbumin flanking region 
attached to a bovine B-casein gene. 

13. The expression system of claim 12 wherein the 
construct contains the polyadenylation site of B-casein 
and approximately 100 base pairs of 5' a-lactalbumin 
flanking region. 

14. The expression system of claim 12 wherein the 
construct includes the proximal promoter from a first 
milk protein and the distal control region of a second 
milk protein. 

15. The expression system of claim 14 wherein the 
first and second milk proteins are selected from the 
group consisting of a-lactalbumin, B-casein, as1-casein, 
as2-casein and K-casein. 

16. The expression system of claim 12 wherein the 
construct includes the proximal promoter of B-casein and 
the distal control region of a-lactalbumin. 

17. A genetically engineered mammal characterized 
by an expression system comprising an a-lactalbumin 
control region operatively linked to an exogenous DNA 
sequence coding for a desired protein to be expressed in 
milk through a DNA sequence coding for a signal peptide 
effective in secreting and maturing the protein in 
mammary tissue. 

18. The genetically engineered mammal of claim 17 
wherein the a-lactalbumin control region includes a 
mammary specific DNA sequence encoding bovine 
a-lactalbumin and having the following nucleotide sequence 
(SEQ ID NO: 19) in the control region of bovine 
a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA 
AGCTAACAGCTA. 

19. The genetically engineered mammal of claim 17 
wherein the a-lactalbumin control region includes a 
mammary specific DNA sequence encoding bovine 
a-lactalbumin and having the following nucleotide sequence 
(SEQ ID NO: 20) in the control region of bovine 
a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA. 

20. Products produced by the genetically 
engineered mammal of claim 17. 

21. Semen produced by the genetically engineered 
mammal of claim 17. 

22. Milk produced by the genetically engineered 
mammal of claim 17. 

23. A transgenic mammal of claim 17. 

24. A DNA sequence coding for a-lactalbumin which 
is operatively linked in an expression system of a 
mammary specific a-lactalbumin protein control region, or 
any control region which specifically activates 
a-lactalbumin in milk or in mammary tissue, through a 
signal peptide that permits secretion and maturation of 
the a-lactalbumin in the mammary tissue. 

25. The DNA sequence of claim 24 comprising the 
following DNA sequence (SEQ ID NO: 19) in the control 
region of bovine a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA 
AGCTAACAGCTA. 

26. The DNA sequence of claim 24 comprising the 
following DNA sequence (SEQ ID NO: 20) in the control 
region of bovine a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA. 

27. The DNA sequence of claim 24 comprising the 
DNA sequence (SEQ ID NO: 3) listed in Fig. 5 in the 
control region of bovine a-lactalbumin. 

28. A process for genetically engineering the 
incorporation of one or more copies of a construct 
comprising an a-lactalbumin control region or any control 
region sequence specifically activated in mammary tissue, 
operatively linked to a DNA sequence coding for a desired 
recombinant protein through a DNA sequence coding for a 
signal peptide that permits the secretion and maturation 
of a-lactalbumin in the mammary tissue. 

29. The process of claim 28 wherein the construct 
is genetically incorporated into a mammal and the 
recombinant protein product is subsequently expressed and 
secreted into or along with the milk of the lactating 
genetically engineered mammal. 

30. The process of claim 29 wherein the construct 
is genetically incorporated in mammalian embryos, mammalian 
mammary glands, or mammalian stem cells. 

31. The process of claim 28 wherein the mammals 
are cows, sheep, goats, mice, oxen, camels, water 
buffaloes, llamas and pigs. 

32. A process for the production and secretion 
into mammal's milk of an exogenous recombinant protein 
comprising the steps of: 

a. producing milk in a genetically 
engineered mammal characterized by an 
expression system comprising a-lactalbumin 
control region operatively linked to an 
exogenous DNA sequence coding for the 
recombinant protein through a DNA sequence 
coding for a signal peptide effective in 
secreting and maturing the recombinant 
protein in mammary tissue; 

b. collecting the milk; and 

c. isolating the exogenous recombinant 
protein from the milk. 

33. The process according to claim 32, wherein 
said expression system also includes a 3' flanking region 
coding for a-lactalbumin downstream of the DNA sequence. 

34. The process according to claim 32, wherein 
said expression system also includes a 5' flanking region 
coding for the signal peptide upstream of the DNA 
sequence. 

35. A selection characteristic for identifying 
superior milk producing mammals comprising a DNA sequence 
encoding bovine a-lactalbumin and having the DNA sequence 
(SEQ ID NO: 3) listed in Fig. 5 in the control region of 
bovine a-lactalbumin. 

36. A selection characteristic for identifying 
superior milk producing mammals comprising inherited 
genetic material which is a mammary specific DNA sequence 
encoding bovine a-lactalbumin and promoting quantitative 
differences in gene expression among mammals, wherein the 
DNA sequence is characterized by variations in the gene 
structure in the control region of bovine a-lactalbumin. 

37. The selection characteristic of claim 36 
wherein one of the variations is in the -13 position of 
the DNA sequence. 

38. The selection characteristic of claim 37 
wherein the -13 position is occupied by adenosine. 

39. The selection characteristic of claim 36 
comprising the following DNA sequence (SEQ ID NO: 19) in 
the control region of bovine a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA 
AGCTAACAGCTA. 

40. The selection characteristic of claim 36 
comprising the following DNA sequence (SEQ ID NO: 20) in 
the control region of bovine a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA. 

41. The selection characteristic of claim 36 
comprising the DNA sequence (SEQ ID NO: 3) listed in Fig. 
5 in the control region of bovine a-lactalbumin. 

42. A method of predicting superior milk and milk 
protein production in mammals comprising comparing 
selected positions on the DNA sequence of the inherited 
control region for a-lactalbumin in a subject mammal 
with analogous positions on the DNA sequence of the 
control region for a-lactalbumin from mammals known for 
superior milk and milk protein production. 

43. The method of claim 42 wherein one of the 
selected positions is the -13 position on the control 
region of the DNA sequence. 

44. The method of claim 43 wherein the -13 
position is occupied by the base adenine. 

45. The method of claim 42 wherein the selected 
DNA sequence comprises a steroid response element. 



AMENDED CLAIMS 

[received by the International Bureau on 14 December 1992 (14.12.92); 
original claims 2,5,9-11,19,24-31,33,35-45 cancelled; 
original claims 1,3,6,8,12,13,15,17,18,20-23,32 amended; 
other claims unchanged (4 pages)] 

1. An isolated DNA sequence which promotes mammary 
specific expression of mRNA in mammary cells of lactating animals 
comprising a variant of bovine a-lactalbumin 5' flanking region 
regulatory sequence wherein the variant is in the -13 position 
from the start of the signal peptide coding region. 

3. The DNA sequence of claim 1 wherein the -13 
position is occupied by adenine. 

4. The DNA sequence of claim 1 comprising DNA sequence 
(SEQ ID NO: 19) in the control region of bovine a-lactalbumin: 
TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA 
ACAGCTA. 

6. The DNA sequence of claim 1 comprising the DNA 
sequence listed in Fig. 5 (SEQ ID NO: 3) as the 5' flanking region 
of bovine a-lactalbumin. 

7. An expression vector comprising the DNA sequence 
of claim 1. 

8. An expression system comprising a mammary specific 
a-lactalbumin control region construct operatively linked to a 
DNA sequence coding for a signal peptide, which when genetically 
incorporated into a non-human mammal permits the female species 
of that mammal to produce the desired recombinant protein in its 
milk. 

12. The expression system of claim 8 wherein the 
construct comprises a 5' a-lactalbumin flanking region attached 
to a bovine B-casein DNA sequence. 

13. The expression system of claim 12 wherein the 
construct contains the polyadenylation site of B-casein 5' 
flanking region. 

14. The expression system of claim 12 wherein the 
construct includes the proximal promoter from a first milk 
protein and the distal control region of a second milk protein. 

15. The expression system of claim 14 wherein the 
first and second milk proteins are selected from the group 
consisting of a-lactalbumin, B-lactoglobulin, B-casein, 
as1-casein, as2-casein and K-casein. 

16. The expression system of claim 12 wherein the 
construct includes the proximal promoter of B-casein and the 
distal control region of a-lactalbumin. 

17. A transgenic non-human mammal containing an 
expression system comprising the DNA sequence listed in Fig. 5 
(SEQ ID NO: 3) as the a-lactalbumin control region operatively 
linked to an exogenous DNA sequence coding for a desired protein 
to be expressed in milk through a DNA sequence coding for a 
signal peptide effective in secreting and maturing the protein 
in mammary tissue. 

18. The transgenic non-human mammal of claim 17 
wherein the a-lactalbumin control region includes a mammary 
specific DNA sequence encoding bovine a-lactalbumin and having 
the following nucleotide sequence (SEQ ID NO: 19) in the control 
region of bovine a-lactalbumin: 

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA 
ACAGCTA. 

20. Products produced by the genetically engineered 
non-human mammal of claim 17. 

21. Semen produced by the transgenic non-human mammal 
of claim 17. 

22. Milk produced by the transgenic non-human mammal 
of claim 17. 

23. A transgenic non-human mammal of claim 17. 

32. A process for the production and secretion into 
non-human mammal's milk of an exogenous recombinant protein 
comprising the steps of: 

a. producing milk in a genetically engineered mammal 
containing an expression system comprising a 
mammary specific a-lactalbumin control region 
operatively linked to a DNA sequence coding for 
a signal peptide, which is effective in secreting 
and maturing the recombinant protein in mammary 
tissue; 

b. collecting the milk; and 

c. isolating the exogenous recombinant protein from 
the milk. 

34. The process according to claim 32, wherein said 
expression system also includes a 5' flanking region coding for 
the signal peptide upstream of the DNA sequence. 